A Scaling Law for the Validation-set Training-set Size Ratio

Author

  • Isabelle Guyon
Abstract

We address the problem of determining what fraction of the training set should be reserved as a development test set, or validation set. We determine that the ratio of the validation set size to the training set size scales like the square root of the ratio of two complexity parameters: the complexity of the second level of inference (minimizing the validation error) over the complexity of the first level of inference (minimizing the error rate on the training set).
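The scaling relation stated in the abstract can be turned into a rough split-size calculator. The sketch below is illustrative only: the function names, the proportionality constant `k` (set to 1 here), and the idea of passing the two complexities as plain numbers are assumptions, since the abstract gives only the scaling behavior and not the constant or how the complexities are measured.

```python
import math


def validation_fraction(c_first: float, c_second: float, k: float = 1.0) -> float:
    """Fraction of all available examples to hold out for validation.

    c_first  -- complexity of the first level of inference (training)
    c_second -- complexity of the second level of inference (validation)
    k        -- hypothetical proportionality constant (not given in the abstract)
    """
    if c_first <= 0 or c_second <= 0 or k <= 0:
        raise ValueError("complexities and k must be positive")
    # Scaling law: n_val / n_train is proportional to sqrt(c_second / c_first).
    ratio = k * math.sqrt(c_second / c_first)
    # Convert the val/train ratio into a fraction of the total data.
    return ratio / (1.0 + ratio)


def split_sizes(n_total: int, c_first: float, c_second: float, k: float = 1.0):
    """Return (n_train, n_val) for n_total examples under the assumed law."""
    n_val = round(n_total * validation_fraction(c_first, c_second, k))
    return n_total - n_val, n_val


if __name__ == "__main__":
    # Example: training is 100 times more complex than model selection.
    print(split_sizes(1000, c_first=100.0, c_second=1.0))
```

Under these assumptions, a first level of inference that is much more complex than the second leads to a small validation set, and equal complexities lead to an even split.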

Similar resources

Deep Learning Scaling is Predictable, Empirically

Deep learning (DL) creates impactful advances following a virtuous recipe: model architecture search, creating large training data sets, and scaling computation. It is widely believed that growing training sets and models should improve accuracy and result in better products. As DL application domains grow, we would like a deeper understanding of the relationships between training set size, com...

Artificial Neural Network Modeling for Predicting of some Ion Concentrations in the Karaj River

The water quality of the Karaj River was studied using a set of 2137 experimental data collected at 20 sampling stations. The data included different parameters such as T (temperature), pH, NTU (turbidity), hardness, TDS (total dissolved solids), EC (electrical conductivity) and basic anion and cation concentrations. In this study a multi-layer perceptron artificial neural network model was d...

Effect of marker density and trait heritability on the accuracy of genomic prediction over three generations

The aim of this study was to determine the effect of marker density, level of heritability, number of QTLs, and size of training set on the genomic accuracy over three generations. Thereby, a trait was simulated with heritability of 0.10, 0.25 or 0.40. For each animal, a genome with 20 chromosomes, 1 Morgan each, was simulated. Different marker densities (2000, 4000 and 6000 markers) and 400 an...

Importance of genetic relationship and phenotypic records for the genomic accuracy of simulated imputed data using animal models in the presence of genotype × environment interactions

The objective of this study was to investigate how genetic relationships between the training and validation sets, together with different ratios of phenotypic records in the training set, affect the accuracy of genomic prediction via animal models containing genotype × environment interactions in simulated imputation data. For this purpose, four different scenarios using 15k density containing differen...

Validation of Optimum ROI Size for 123I-FP-CIT SPECT Imaging Using a 3D Mathematical Cylinder Phantom

Objective(s): The partial volume effect (PVE) of single-photon emission computed tomography (SPECT) on corpus striatum imaging is caused by the underestimation of specific binding ratio (SBR). A large ROI (region of interest) set using the Southampton method is independent of PVE for SBR. The present study aimed to determine the optimal ROI size with contrast and SBR for striatum images and val...

Publication date: 1997